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Lattice QCD with light dynamical quarks* 

qq+q Collaboration 

F. Farchioni a , C. Gebert b , I. Montvay b , L. Scorzato b . 

a Institut fur Theoretische Physik, Universitat Minister, Wilhclm-Klcmm-Str. 9, 
D-48149 Miinster, Germany 

b Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, D-22603 Hamburg, Germany 

We report on the simulation of QCD with light dynamical quarks using the two-step multi-boson (TSMB) 
algorithm. In an exploratory study with two flavours of quarks at lattice spacing about 0.27 fm and with quark 
mass down to one sixth of the strange quark mass eigenvalue spectra and autocorrelations have been studied. 
Here we present results on the volume dependence as well as tests of possible algorithmic improvements. 



1. INTRODUCTION 

Predicting the low energy constants of the chi- 
ral effective Lagrangian from first principles in 
QCD is an exciting challenge for lattice gauge 
theories. This question has attracted a consider- 
able attention in the last few years [[j]J^||. How- 
ever, existing unqucnched simulations are typ- 
ically done in a region where the light quarks 
are not light enough, in most cases - especially 
with Wilson-type quarks - with two light quark 
flavours (it and d) having masses larger than 
half of the strange quark mass (m u d > \m s ). 
Since the quark mass dependence in the chiral 
Lagrangian is explicit, it is not necessary to per- 
form simulations at the physical value of the u- 
and (i-quark masses, which remains a too hard 
task for the forseeable future. Nevertheless, in 
order to keep the systematic errors under con- 
trol, three dynamical quarks should be simulated 
and, according to 0, the masses of the light ones 
should satisfy at least m u d < \m 8 . 

Simulating light dynamical quarks is a diffi- 
cult task for numerical computations because all 
known algorithms for QCD have a substantial 
slowing down towards small quark masses. The 
present status has been summarized at the Berlin 
lattice conference [ 5|. Inspired by that discussion, 
in a recent paper [6] we parametrized the cost of 



an unquenched QCD simulation in two different 
ways: 

C = F (r „ ra .)-(§)*'(^)*- , (1) 



Cu = Ftj 



"Combining the contributions of F. Farchioni and L. 
Scorzato. 



Here rg is the Sommer scale parameter j7| , m w the 
pion mass, L the lattice extension and a the lat- 
tice spacing. The powers z^^p^a an d the over- 
all constants F, Fjj are empirically determined. 
The value of the constants F, Fjj depends on the 
definition of "cost" . Of course, when the quark 
masses are very light and the decay p — > 7T7t is 
possible, the first parametrization is preferable. 
Typical values of the parameters can be taken 
from Ukawa's contribution ||: F\j — 5.9 • 10 6 
flop, z^p = 6, Zl = 5, z a = 2. Other estimates 
are in reasonable agreement. However most of the 
simulations, whereon such estimates are based, 
were perfomed at quite large quark masses. This 
is a reason of concern because, as experienced al- 
ready by several collaborations H , at small quark 
masses new phenomena appear, especially for al- 
gorithms based on molecular dynamics integra- 
tors. 

Given the compelling need of simulations at 
lighter quarks, we performed algorithmic studies 
in the regime of very light quark masses (though 
on very rough lattices) using the two-step 
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multi-boson algorithm |9|]. Here we give an up- 
date of the cost analysis. We also include a pre- 
liminary study of the volume dependence and a 
test of a possible improvement of the algorithm. 

2. COST OF THE SIMULATIONS 

2.1. Algorithm 

Let us first briefly summarize the main features 
of TSMB. The quark determinant of Nf degener- 
ate flavours is represented as 

Nf 1 

(3) 



det(Q) 



det P^{Q 2 ) detP^ ; (Q 2 ) 



(2), 



Q is the hermitean Wilson-Dirac fermion matrix. 
The polynomials P^ and P^ 2 -* satisfy 

lim pV>(x)P$(x) = x- N f/ 2 , x e [e,A] . (4) 

ri2— >oo 

The interval [e, A] covers the spectrum of Q 2 on 
a typical gauge configuration. The first polyno- 
mial PW of order n\ is a crude approximation 
and is realized by the multi-boson representation. 
The second polynomial P^ of order ri2 S> n\ 
gives a better approximation. It is taken into ac- 
count in the updates by a global noisy Metropolis 
correction step. Since for fixed n-i (and outside 
the interval [e, A]) the approximation pWpt 2 ) is 
not exact, a final correction step is performed by 
reweighting the gauge configurations which are 
considered for the evaluation of the expectation 
values. This is done with a sufficiently high or- 

(2) 

der polynomial P„ 4 . For more details about the 
algorithm in this context we refer to S . 

2.2. Parameters and the definition of cost 

The Monte Carlo simulations have been done 
near the N t = 4 thermodynamical cross-over line. 
We tuned the gauge coupling {3 and the hopping 
parameter k in order to explore a range of the 
quark mass going from 2m s to ^rn s while keeping 
ro/a roughly constant between r^/a = 1.7 and 
Tq /a = 1.9, which implies a ~ 0.27 fm. Most of 
the analysis is performed on lattices of size 8 3 • 16 
implying a physical lattice extension L ~ 2.2 fm. 
For the study of finite volume effects we used 12 3 ■ 
24 and 16 4 lattices. 

The cost of a simulation can be measured in 
units of the floating point operations (flop) which 



are necessary to produce a new decorrelated con- 
figuration (starting from a themalized configura- 
tion). The decorrelation quantified by the inte- 
grated autocorrelation length Ti nt depends on the 
quantity whose autocorrelation is observed. We 
consider the average plaquette, the lowest eigen- 
value of the fermion matrix, the pion correlator 
at some fixed distance and the pion mass. In- 
stead of the number of flops the integrated auto- 
correlations can also be quoted by the necessary 
number of fermion-matrix vector multiplications 
(MVM's). We always count mutiplications by the 
fermion matrix Q. In this way the dependence on 
the computer architecture is reduced. For TSMB 
we have the following approximate formula for 
the total amount of MVM's needed for one up- 
date cycle: 

N ~ 6(n 1 N< s , + N u ) + 2(n 2 + n 3 )N c + I G F G .(5) 

Here is the number of local bosonic sweeps 
per update cycle, Nu the number of local gauge 
sweeps, Nc the number global Metropolis accept- 
reject correction steps, and Iq and F G give the 
number of MVM's and frequency of the global 
quasi heatbath. (713 is the order of an auxiliary 
polynomial P (3) [|.) Of course, the number of 
MVM's can easily be converted to flops by noting 
that the number of flops per lattice point (in our 
code) is 1344. The quantity in (||) is reported for 
the runs considered here in the columns 7 and 5 
of tables @ and |^ respectively. 

2.3. Results about the costs 

In Q power fits have been given for the depen- 
dence of the integrated autocorrelations on the 
quark mass parameter M r = (rom^) 2 . For in- 
stance, in case of the average plaquette a good fit 
is 

plaq _ M z plaq 
1 int ~ c plaq+"r , 

c plaq = 7.92(68) , z plaq = -2.02(10) , (6) 

if the cost is counted in MMVM's (= 10 6 fermion- 
matrix vector multiplications). This corresponds 



tO Zt, 



4 in (if). A reasonable fit of r; 



plaq 



also be found in the form (^|) with z^p — 6 if the 
heaviest quark mass point is omitted. 

Another interesting quantity is the pion corre- 
lator and/or the pion mass. For instance, the cost 
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for the pion correlator at timeslice distance d = 5 
is shown together with two power fits in figure 
The fit to all points gives a power similar to z v \ aq : 

c 5 = 2.68(35) , z 5 = -1.96(17) . (7) 

However, if the point at M r ~ 3 is omitted from 
the fit we obtain a lower power: 

c 5 = 2.99(22) , z 5 = -1.15(28) . (8) 

Fitting the cost for obtaining the pion mass, the 
result is: 

c mx = 1.99(16) , z m „ = -1.47(16) . (9) 

(The data and the fit is shown in figure |2|) . This 
corresponds to z^ ~ 3 in ([!]), similarly to the 
result for the minimal eigenvalue of the squared 
hermitean fermion matrix: 

c mm = 5.36(80) , z mm = -1.48(25) . (10) 

It is remarkable that the pion mass has a substan- 
tially lower cost than, for instance, the average 
plaquette. In fact, at M r ~ 0.5 the cost ratio is 
already almost 10 in favour of the pion mass. This 
is partly due to the intrinsic fluctuation present 
in the pion propagator which is originated from 
the freedom of randomly choosing the position of 
the source. 

2.4. Analysis with correction factors 

The importance of the final correction step by 
the reweighting factors depends on the choice of 
the second polynomial P^ 2 \ For good enough 
P^ 2 ) the reweighting is negligible. In Q this has 
been the case for all but one runs. (The run where 
the reweighting was not completely negligible has 
the index (i).) Of course, at very small quark 

(2) 

masses Pn 2 becomes good enough only for a very 
large order ri2- In such a case it is a question 
whether it would be more advantageous to keep 
7i2 smaller and tolerate a non-trivial reweighting. 

The consequences of a poor P^ at small quark 
masses could, however, be rather unpleasant. An 
example has been shown in in run (i) at 
M r ~ 0.6 one of the four parallel runs had a seri- 
ous fluctuation with exceptionally small eigenval- 
ues and, as a consequence, very small correction 
factors in a substantial part of the run (see fig- 
ure 2 in ||). In this part of the run also a few 



Quark mass dependence of pion-correlator (d=5) autocorrelations 
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fit: c'xf 

fit1:c1*x z1 


c= 2.68(35) 
z = -1.96(1 7) 
X 2 'ndf = 2.5 




d = 2.99(22) 
z1 =-1.15(28) 
r x 2 'ndf = 0.8 





M r = (r„ mf 



Figure 1. Power fit of the quark mass depen- 
dence of the autocorrelation of pion correlator at 
distance 5. 

negative eigenvalues of the fermion matrix have 
been observed. The smallness of the correspond- 
ing reweighting factors implies that with a better 
second polynomial P^ such fluctuations could 
have been suppressed. 

The question is, what is the consequence of 
such a fluctuation on the integrated autocorre- 
lations and on the related magnitude of statisti- 
cal errors. This can be investigated, for instance, 
by considering the linearized deviation of effec- 
tive pion masses which appears in the lineariza- 
tion method for determining the autocorrelations 
of functions of primary averages |icj ] . For an illus- 
tration see the case of the effective mass between 
timeslice distances d — 3 and d = 7 after in- 
cluding the reweighting factors in the above men- 
tionned run of || (figure ||). As this figure shows, 
the exceptionally small eigenvalues and the cor- 
responding small reweighting factors are accom- 
panied by a large downwards fluctuation of the 
effective mass m^' . This implies that in this run 
the average value of m e J* is smaller than in the 
"normal" runs and at the same time its error is 
larger. This pulls down the overall mass estimate: 
the average of all four parallel runs gives for this 
effective mass m e J s = 0.4160(103). For the three 
other runs the result is: m e J f = 0.4298(63). Our 
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Effective mass deviation: 3-7 in run 2 



Quark mass dependence of pion mass autocorrelations 
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fit: c*x z - 






c= 1.99(16) 






z = -1.47(16) 






X 2 /ndf = 1.7 


H 















M r = (r m^ 

Figure 2. Power fit of the quark mass dependence 
of the autocorrelation of the pion mass. 



conclusion is that it is probably better to suppress 
the kind of fluctuations in figure || by sufficiently 
improving 

p(2). 



3. VOLUME DEPENDENCE 

The physical volume of the lattices in || is such 
that strong finite volume effects are not expected. 
For checking this and for observing the behaviour 
of our algorithm in increasing volumes new sim- 
ulations were performed on a 12 3 ■ 24 and a 16 4 
lattices (see (fl2) and (el6) in table [l]). In order 
to hold all the quantities fixed, except for the vol- 
ume, run (fl2) has the same value of j3 and k as 
run (f) and run (el6) the same as run (e). The 
lattice extensions deduced from r /a are collected 
in the last column of table [|. 

When going to a larger lattice one has to adjust 
in TSMB the first polynomial order n\ . Assuming 
no large finite volume effects the interval given by 
e and A remains unchanged. Therefore it is not 
needed to change ni and 713. This is true for the 
runs (e) and (el6). For run (f) the lower bound- 
ary e was not optimal therefore we used a better 
(i.e. smaller) one in run (fl2). Consequently, rii 
and ri3 had to be increased there. 



r 16 lattice 
O' 4 - p = 4.64 
k = 0.197 



average with error - 
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o=33 



-0.2 - 
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« 1. 1 



15000 



25000 



Figure 3. The linearized deviation of the effective 
pion mass calculated from the distance pair 3 and 
7 as a function of the number of update cycles. 

A remarkable feature in table |l| is that on 
the larger lattices the autocorrelation of the pion 
mass becomes even more favourable compared to 
the average plaquette. This has the consequence 
that the cost for an integrated autocorrelation 
length of the pion mass increases substantially 
slower than the number of lattice points. As ex- 
pected only a tiny finite-size effect is observed for 
the physical quantities measured. 

4. DETERMINANT BREAKUP 

Inspired by Jll| (see also we tried 

to improve autocorrelations by "determinant 
breakup". In our case this means, for instance, 
that simulating two flavours separately should be 
more effective than two degenerate flavours to- 
gether. In general, one can write the determinant 
as 

„ N f ( _ N } /N b \ N b 

det(Q) = I det(Q) V , (11) 

with an arbitrary positive integer Nb and use the 
polynomial approximation (||) for the individual 
factors on the right hand side. 

For tests we have chosen run (b) in || where 
M r ~ 3 (close to the strange quark mass). Re- 
sults are shown in table [|. (Note that r™^ are 
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Table 1 

Analysis on larger volumes for run (e) at [3 



4.76, k = 0.190 and run (f) at /3 = 4.80, k = 0.190. 



run 


ni 


n 2 


n 3 


A 


e- 10 4 


kMVM 






L[fm] 


(e) 


44 


360 


380 


3.6 


2.7 


8.50 


4.59(37) TO 13 


1.94(31) -10 13 


2.31(6) 


(el6) 


72 


350 


440 


3.6 


2.7 


12.4 


7.48(1.31) TO 14 


5.02(55) -10 ia 


4.57(9) 


(f) 


44 


360 


380 


3.6 


2.7 


8.48 


7.47(84) -10 ia 


1.76(59) -10 ia 


2.25(4) 


(fl2) 


72 


500 


560 


3.4 


1.36 


12.3 


2.40(41) TO 14 


4.52(82) -lO 13 


3.02(9) 



Table 2 

Run (b) with determinant breakup in two, four and eight pieces. Here j3 
See also B for the other quantities measured at this point. 



5.04, k = 0.174, Vol= 



16. 



Nb (flavours) 


n\ 


n 2 


n 3 


kMVM 


rt7WVM] 


t™[MVM] 


r^[MVM] 


1 (JV> = 2) 


28 


90 


120 


3.22 


1.13(16) -10 b 


1.09(19) -10 b 


3.67(48) -10 b 


2 (Nf = 1 + 1) 


20 


84 


100 


4.45 


6.05(53) -lO 5 


6.45(90) -10 b 


2.36(36) -10 b 


4 (N f =4x|) 


14 


72 


80 


6.20 


1.55(37) -10 b 


1.95(50) -10 b 


3.60(56) -10 b 


8 (Nf = 8 x |) 


10 


64 


80 


9.79 


1.17(29) -10 b 


1.63(39) -10 b 


4.77(58) -10 b 



obtained here from effective masses with the lin- 
earization method for autocorrelations and er- 
rors and therefore are slightly different from ||.) 
Tests at smaller quark masses and/or in larger 
volumes will be done later. 

Choosing some breakup of the flavours we fixed 
n\ by requiring a constant acceptance rate of 50- 
60%. The higher polynomial orders n 2 and n 3 
were fixed by keeping the relative deviation at 
the interval ends roughly constant. We found 
that the autocorrelation measured in update cy- 
cles is reduced when breaking up the determinant 
in more pieces. However the costs per update 
cycles are increasing. It is remarkable that the 
cost is rising less than one would expect by just 
looking at the numbers for n\ and n 2 . The rea- 
son is that the total number of iterations in the 
global quasi heatbath Iq in (g) are roughly con- 
stant, although the number of boson fields NbUi 
is increasing. Putting everything together we find 
that at this simulation point one can gain from 
determinant breakup almost a factor of two when 
splitting up two flavours in two pieces. 



change with increasing volume. As described 
in we used the implicitely restarted Arnoldi 
method to compute a portion of the eigenval- 
ues of the non-hermitean Wilson-Dirac fermion 
matrix Q. Typically we computed the eigenval- 
ues with lower real part of the matrix Q and those 
with smallest modulus of the preconditioned ma- 
trix Q (as defined in Q ) . By use of the analytical 
relations in || we had access to a reasonable part 
of the spectrum of Q around the origin. 

In figure || we plot O(200) eigenvalues from 
10 configurations thermalized at point (f) on the 
8 3 • 16 lattice. The portion of spectrum to which 
we have access is that inside the dashed circle and 
to the left of the dashed vertical line. In figure 
H we plot again O(200) eigenvalues from 10 con- 
figurations thermalized at point (fl2) on 12 3 • 24 
lattice, which has the same parameters as (f) - 
except for the volume. As expected, the lowest 
border of the spectrum does not essentially move, 
even if the density of eigenvalues is higher (and as 
a consequence, the portion of spectrum that we 
can cover is smaller). 



5. EIGENVALUES 

The study of the low lying modes of the Wilson- 
Dirac operator (Q) is interesting both physically 
and algorithmically. It is an interesting ques- 
tion how the qualitative features of the spectrum 



6. CONCLUSION 

In conclusion, the cost increase with the lattice 
volume is quite acceptable because it is close to 
the trivial volume factor. In case of the autocor- 
relation of the pion mass the observed increase 
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Figure 4. Eigenvalues of the non-hermitean 
fermion matrix in run (f). 
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Figure 5. Eigenvalues of the non-hermitean 
fermion matrix in run (f!2) . 



turns out to be even smaller. 

The determinant breakup is an easy and effec- 
tive method to speed up the gauge field update. 
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